Scalable switching fabric

ABSTRACT

A switch fabric includes a first plurality of data switches, each having a plurality of input ports and a plurality of output ports, each of the plurality of switches being capable of switching any of its input ports to any of its output ports, with the plurality of data switches having inputs coupled to a plurality of input buses so that a first byte of a first one of the input buses is coupled to a first one of the plurality of switches, and a succeeding byte of the first input bus is coupled to a succeeding one of the plurality of switches.

BACKGROUND

[0001] This invention relates to switching fabrics used to switch data in computer networks and other data moving applications.

[0002] Crossbars are one type of switching fabric used to switch data between pluralities of devices. They can be thought of as a switch that has a plurality of vertical paths interconnected by switching elements to a plurality of horizontal paths in a manner that the switch elements can interconnect any one of the vertical paths to any one of the horizontal paths. Generally, such crossbars are implemented with custom application specific integrated circuits (ASIC's).

SUMMARY

[0003] According to an aspect of the present invention, a switch fabric includes a network switch having a plurality of inputs and outputs and a distributed switching arrangement to provide a non-blocking switching fabric capability over a series of byte sliced buses.

[0004] According to an additional aspect of the present invention, a switch for coupling network devices to a network processor includes a plurality of virtual queues and input segment logic coupled to at least one bus, said input segment logic to determine to which virtual queue incoming data should be sent, and output segment logic to select which new virtual queue should be connected to an output port.

[0005] According to an additional aspect of the present invention, a switch fabric includes a first plurality of data switches each having a plurality of input ports and a plurality of output ports, the plurality of switches capable of switching any of its input ports to any of its output ports, with the plurality of data switches having inputs coupled to a plurality of input buses so that a first byte of a first one of the input buses is coupled to a first one of the plurality of switches, and a succeeding byte of the first input bus is coupled to a succeeding one of the plurality of switches.

[0006] One or more of the following advantages may be provided by one or more aspects of the invention.

[0007] A high performance, scalable switching fabric is provided for scaling a rotary switch for a multitude of ports. The rotary switch uses virtual queuing, providing the rotary switch controller (RSC) full crossbar capability, such that any of its input queues can couple to any of its output queues without blocking. The RSC permits dynamic configuration of additional ports. The RSC is a modular concept allowing a switch to grow from e.g., 32 ports to 64 ports to 128 ports using a passive backplane.

BRIEF DESCRIPTION OF THE DRAWINGS

[0008] FIG. 1 is a block diagram of a network system including a rotary switch.

[0009] FIG. 2 is a block diagram showing an implementation of the rotary switch.

[0010] FIG. 3 is a block diagram of the rotary switch.

[0011] FIG. 4A is a block diagram of a rotary switch coupled in a byte sliced configuration.

[0012] FIG. 4B is a block diagram of two rotary switches coupled in a byte sliced configuration.

[0013] FIG. 5A is a chart diagram showing byte mapping for a single rotary switch of FIG. 4A.

[0014] FIG. 5B is a chart diagram showing byte mapping in the device of FIG. 4B.

[0015] FIG. 6 is a block diagram showing structures used in arbitration in the rotary switch.

DESCRIPTION

[0016] Referring to FIG. 1, a networked system 10 includes a rotary switch 12 transferring data from input ports to output ports in a non-blocking manner. For instance, the switch can be used for sending packet data from a plurality of network processors 14 to network devices 16 coupled to separate 32 bit FIFO busses. The rotary switch 12 includes a data path that is sliced or partitioned on an 8 bit (i.e., byte) basis to allow system 10 to be expanded from e.g., a 2×2 FIFO bus system to an 8×8 FIFO bus system and so forth as described below. The rotary switch 12 includes a plurality of 8-bit wide Virtual Input Queues (“VIQ”) 18 that are distributed into input segments, with each of 8 byte-wide input segments coupled to 16 of the VIQ's 18. The rotary switch device 12 also includes a plurality of Output Segments 20.

[0017] The rotary switch 12 also includes a switching fabric network 24 which, in combination with the Virtual Input Queues 18 and output segment logic 20, can move byte wide data from any of the plurality of Virtual Input Queues 18 to an output FBUS byte without restricting the access of any other input segment to any of the Virtual Input Queues 18. Data switching is controlled by an arbiter 22.

[0018] The internal fabric 24 of the RSC 12 provides full input to output connectivity, that is, connectivity of any and all of the inputs to any and all of the outputs. In an exemplary rotary switch 12, there are 128 8-bit wide Virtual Input Queues (“VIQ”), distributed over 8 segments, e.g., 16 VIQ's per segment, and 8 output segments, thus providing a device 12 that can connect any of 128 input ports to any of the 128 output ports. The fabric is an independently queued structure that does not require symmetric switching. The switching is a distributed function in a loose arrangement between the processors 14 and the RSC fabric 24. The RSC arbiter 22 provides a fair round robin service for received packets. The processors 14 can provide packets to the RSC 12 either through simple round robin or weighted fair queuing. The output port switching is based on a PULL arbitration scheme.

[0019] Referring to FIG. 2, an implementation of the system 10 of FIG. 1 is shown. A rotary switch device 12 is shown coupled to a pair of network processors 14. The network processors are preferably parallel-based multithreaded processors. One example of such a processor is described in U.S. patent application entitled “PARALLEL PROCESSOR ARCHITECTURE”, filed on        by and assigned to the assignee of the present invention and incorporated herein by reference. Each of the processors 14 communicates with data supplying devices 13, e.g., here Media Access Controllers (MAC's) that are coupled to the physical layer of a network 30.

[0020] The system 10 also includes a passive backplane 30. The passive backplane 30 employs tri-state steering logic to enable dynamic reconfiguration of the system 10 based on the number of ports supported. This system 10 is a byte sliced arrangement. As a byte sliced arrangement, when new ports are added, all ports stop transmitting to RSC devices 12. Depending on buffering and steering initialization time, input ports may or may not have to be paused. The passive backplane includes nine (9) main busses 30a-30i. The first bus 30a is a computer bus, e.g., a Peripheral Component Interconnect (PCI) bus. While this is a bridged bus, and therefore strictly speaking the backplane is not passive, the bridge and microprocessor unit that is commonly associated with busses such as the PCI bus can be provided as a daughter card to maintain a passive backplane 30. The other 8 busses on the backplane are used to interconnect the RSC blades Blade_0-Blade_3. An RSC blade is an arrangement of RSC devices 12, network processors 14 and network devices 16.

[0021] Since the FBUS data (32 bit unidirectional busses) is sectioned into up to four 8 bit segments, the segments can be sized based on the number of equivalent ports supported by the system 10. If a single blade is used, i.e., a 32 port system, the backplane steering logic connects Bus 30b to Bus 30e and Bus 30i to Bus 30f to provide input data, i.e., two 24 bit FBUS data busses, to the RSC 12. These two busses, along with immediate feedback of 8 bits each from Bus 30b and Bus 30i, form two 32 bit input busses to the RSC 12. Bus 30d is steered to Bus 30c for transmit to the MAC device 16. The 24 data bits of the FBUS busses along with the 8 bits from the RSC 12 form 32 bits to the MAC. Similarly, Bus 30g is steered to Bus 30h and merged with 8 bits of immediate feedback data from the RSC 12 to form 32 bits to the MACs 16.

[0022] Each blade, e.g., Blade_0 to Blade_3, is similar in construction, and the blades are scaled together via the passive backplane 32. Thus, if system 10 has 64 ports supported, the steering logic (not shown) selects the appropriate bytes from each of the busses and connects them to their respective destination busses in 16 bit sections. If 128 ports are instantiated, then the section size is 8 bits. It should be noted that a 64 port system can be configured into a 96 port system where the section size is 8 bits and three RSC blades are used, with only 6 input segments (instead of the possible 8). In this type of configuration, a fourth RSC is required to complete the 32 bit byte-slice. Therefore, 96 port systems require the use of a simple add-in board which is populated with only an RSC using bits (31:8) on both input sections.

[0023] Referring to FIG. 3, the RSC 12 includes input segment logic devices (ISL) 40a-40h that handle incoming FBUS data and distribute the incoming data to an appropriate virtual input queue in virtual queue logic (VQL) devices 42a-42p. The RSC 12 also includes output segment logic devices (OSL) 44a-44h. The OSL devices 44 pull data from the VQL logic devices 42 and deliver the data to the output side of the FBUS for distribution to appropriate MAC devices 14. The input virtual queues 42₀ to 42₁₂₇ are coupled to the output segment logic 44 via a series of multiplexers 47₀ to 47₁₂₇. Each of the multiplexers 47₀ to 47₁₂₇ is coupled to each of the virtual input queues 42 in its row (e.g., VIQ 42₀-42₁₁₂ for multiplexer 47₀). There are sixteen of said multiplexers in each column. The outputs of the multiplexers 47₀ to 47₁₂₇ for each of the columns (e.g., multiplexers 47₀-47₁₅ for column 0) feed corresponding output multiplexers 49₀ to 49₇, which are coupled to the output segment logic 44a-44h.

[0024] The RSC 12 also includes Input Ready Logic 46 that samples the virtual input queues in the virtual queue logic (VQL) devices 42a-42p to report back the status of the virtual input queues to mapped input segments, so that the devices 14 (FIG. 1) that supply data can track buffer fullness. The RSC 12 also includes Output Ready Logic 48. The Output Ready Logic (ORL) 48 is analogous to the Input Ready Logic. However, the ORL 48 samples the network devices 16, e.g., MAC transmit ready bits, to determine if the network devices 16 are ready to accept more transmit data. The RSC 12 also includes Output Segment Arbitration Logic 50. The Output Segment Arbitration Logic 50, as will be described in FIG. 6, is used to determine which virtual input queue should provide data to its output segment in an appropriate timeslot. One preferred scheme has the Output Segment Arbitration Logic 50 using a round robin time multiplexed arbitration algorithm.

[0025] The Input Segment Logic (ISL) 40 interfaces with the network processor 14 and determines which virtual queue an incoming mpkt (64 byte payload) should be sent to. The RSC 12 has a plurality of virtual queues. In one example, there are 16 virtual queues to which each input segment can direct incoming mpkts. The RSC 12 is arranged into segments, e.g., 8 input segments. If fewer input segments are required (i.e., the RSC 12 is configured for fewer ports), then logically contiguous input segments are joined to form either a 32 bit datapath (32 ports), or a 16 bit datapath (64 ports).
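By way of illustration only, the relationship between the configured port count and the number of contiguous 8 bit input segments joined into one datapath can be sketched as a small lookup. The names `SECTION_WIDTH_BITS` and `joined_segments` are hypothetical and are not part of the described device; the widths are those stated in the description for 32, 64, 96 and 128 port configurations.

```python
# Illustrative lookup only: names and structure are assumptions, not the RSC's logic.
SEGMENT_WIDTH_BITS = 8  # each RSC input segment is one byte wide

# Section (datapath) width in bits for each port count named in the
# description: 32 ports -> 32 bits, 64 -> 16 bits, 96 and 128 -> 8 bits.
SECTION_WIDTH_BITS = {32: 32, 64: 16, 96: 8, 128: 8}


def joined_segments(ports: int) -> int:
    """Return how many contiguous 8 bit input segments form one datapath."""
    if ports not in SECTION_WIDTH_BITS:
        raise ValueError(f"unsupported port count: {ports}")
    return SECTION_WIDTH_BITS[ports] // SEGMENT_WIDTH_BITS
```

Under these assumptions a 32 port configuration joins four segments, a 64 port configuration joins two, and a 128 port configuration uses each segment on its own.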

[0026] The ISL 40 uses in-band information to control virtual queue loading and port arbitration. In-band information is used to minimize pin cost as would be associated with explicit out-of-band control. The in-band information includes a destination output port (8 bits), an SOP bit, a “Transmit As Is” control bit, a byte enable control bit, a CSR enable, and a Virtual Queue Identifier (4 bits). Since there are only 8 bits of in-band data available per cycle, and 16 bits are required, 2 in-band cycles are required. As an optimization, in 64 or 32 port modes, 16 bits or 32 bits of in-band information, respectively, are available per cycle, and therefore only one in-band information cycle is required.
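The 16 bits of in-band information and the number of in-band cycles needed to carry them can be sketched as follows. This is a minimal sketch under stated assumptions: the field widths are those listed above, but the bit positions, the ordering of the two 8 bit cycles (upper byte first), and the names `InBandHeader` and `in_band_cycles` are hypothetical and are not given in the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InBandHeader:
    dest_port: int        # destination output port, 8 bits
    sop: bool             # start of packet
    transmit_as_is: bool  # "Transmit As Is" control bit
    byte_enable: bool     # byte enable control bit
    csr_enable: bool      # CSR enable
    vq_id: int            # Virtual Queue Identifier, 4 bits

    def pack(self) -> int:
        """Pack the fields into 16 bits (the bit layout is an assumption)."""
        if not 0 <= self.dest_port <= 0xFF:
            raise ValueError("dest_port must fit in 8 bits")
        if not 0 <= self.vq_id <= 0xF:
            raise ValueError("vq_id must fit in 4 bits")
        return ((self.dest_port << 8)
                | (int(self.sop) << 7)
                | (int(self.transmit_as_is) << 6)
                | (int(self.byte_enable) << 5)
                | (int(self.csr_enable) << 4)
                | self.vq_id)


def in_band_cycles(header: InBandHeader, datapath_bits: int) -> list:
    """Return the value presented on each in-band cycle for a given datapath width."""
    if datapath_bits not in (8, 16, 32):
        raise ValueError("datapath must be 8, 16 or 32 bits wide")
    word = header.pack()
    if datapath_bits >= 16:
        return [word]                       # the whole header fits in one cycle
    return [(word >> 8) & 0xFF, word & 0xFF]  # two 8 bit cycles, upper byte first
```

With an 8 bit datapath the sketch yields two cycles; with a 16 or 32 bit datapath it yields one, matching the optimization described for 64 and 32 port modes.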

[0027] Input data to the crossbar includes two 32 bit header words containing control information followed by up to eight (8) 32 bit words of packet data. The 32 bit input word is connected to 4 input segments. The header is segmented into 2 bytes of control for each input segment, and specifies the VIQ to load, the output destination, byte count, end of packet, and byte enable for the last 32 bit word being transmitted. The 4 output segments specified receive the VIQ addresses of new packets being loaded in a “pending” output FIFO. All packets being sent to the same output port are loaded into a similar “pending” FIFO so that all four output byte segments begin to send data to the output FIFO bus on the same cycle. Byte data from the 4 output segments is combined to form the 32 bit output FIFO bus.
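The byte slicing of a 32 bit FIFO bus word across four segments, and the recombination of the four output bytes into a 32 bit word, can be illustrated with a short sketch. The function names are hypothetical; the lane ordering (lane 0 carries bits 31:24) follows the byte numbering used in the mappings of FIGS. 5A and 5B.

```python
def slice_word(word: int) -> list:
    """Split a 32 bit word into four byte lanes; lane 0 carries bits 31:24."""
    if not 0 <= word <= 0xFFFFFFFF:
        raise ValueError("word must fit in 32 bits")
    return [(word >> shift) & 0xFF for shift in (24, 16, 8, 0)]


def merge_lanes(lanes: list) -> int:
    """Recombine four byte lanes into one 32 bit word."""
    if len(lanes) != 4:
        raise ValueError("exactly four byte lanes are required")
    word = 0
    for byte in lanes:
        if not 0 <= byte <= 0xFF:
            raise ValueError("each lane carries one byte")
        word = (word << 8) | byte
    return word


def slice_packet(words: list) -> list:
    """Return four per-segment byte streams for a sequence of 32 bit words."""
    streams = [[], [], [], []]
    for word in words:
        for lane, byte in enumerate(slice_word(word)):
            streams[lane].append(byte)
    return streams
```

Because every segment receives the same number of bytes for a given packet, the four streams stay aligned word for word, which is what allows the four output byte segments to begin sending on the same cycle.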

[0028] Two output segments per RSC are enabled to drive control signals (start of packet, end of packet, transmit as is, transmit error), while all segments drive byte enable signals. The output control logic samples the ready signals for all output destinations. All output segments update their ready bit status in lock step so that the 4 sliced bytes of the input FIFO bus can be switched at the same time.

[0029] The inputs to the input segment logic 40 include the FBUS data bits (7:0), and control signals TxSel, EOP, and NewQHdr. The FBUS data bits are as above; the TxSel bits are used to frame the FBUS data bits as valid, whereas EOP is used to explicitly identify the end of a packet. The NewQHdr bit indicates to the ISL 40 that a new set of Virtual Queue information is coming. An optimization may be that if EOP and NewQHdr are both asserted, only a single prepend cycle is required to indicate a target Virtual Queue. The implication in that case is that the transfer is not a new packet but rather continuation data from a currently transmitting packet. In this case an in-band EOP is required.

[0030] The Input Ready Logic (IRL) 46 samples the status of the 16 virtual queues in virtual queue logic 42a-42p associated with each input segment. If a VIQ has available space, then the IRL reports that to a requesting network processor 14 (FIG. 1) via VIQ transmit ready bits. The network processor 14 can use this information to schedule transfers to the virtual queues.

[0031] The virtual queues (VIQ's) are associated with a particular output while there is valid data maintained in the virtual queue. The virtual queues can have a suitable storage depth, e.g., 4 mpkts for 14 of the queues and up to 8 mpkts for two, where each mpkt is 64 bytes. There are 16 virtual queues associated with each input/output segment.
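The reporting of VIQ transmit ready bits by the Input Ready Logic can be sketched for one input segment as follows. This is an illustrative model only: the description gives the example depths (4 mpkts for fourteen queues, 8 mpkts for two) but does not say which two queues are the deeper ones, so placing them at indices 14 and 15 is an assumption, as are the names used.

```python
VIQS_PER_SEGMENT = 16

# Assumed placement: queues 14 and 15 are the two deeper (8 mpkt) queues.
VIQ_DEPTH_MPKTS = [4] * 14 + [8] * 2


def transmit_ready_bits(occupancy: list) -> int:
    """Return a 16 bit mask; bit i is set when VIQ i has room for another mpkt."""
    if len(occupancy) != VIQS_PER_SEGMENT:
        raise ValueError("one occupancy value per VIQ is required")
    mask = 0
    for index, (used, depth) in enumerate(zip(occupancy, VIQ_DEPTH_MPKTS)):
        if not 0 <= used <= depth:
            raise ValueError(f"VIQ {index} occupancy out of range")
        if used < depth:
            mask |= 1 << index
    return mask
```

A network processor polling this mask would schedule transfers only to queues whose bit is set.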

[0032] Other arrangements are possible. Each VIQ has an input pointer and an output pointer. The input pointer is used by the Input Segment Logic 40 (ISL) to push data into the VIQ, while the output pointer is used by the Output Segment Logic to “pull” data from the VIQ for distribution out the transmit FBUS. In one implementation, the VIQ's are single ported random access memory devices. Since a read and a write may be concurrently required for full crossbar operation, the VIQ's are cycled twice as fast as the input fill rate. For example, if the input fill rate is 66-80 MHz from the input segment FBUSES and the output drain rate is a decoupled 66-80 MHz FBUS drain rate, then the VIQ's would operate at 133 to 160 MHz, that is, twice as fast as the faster of the Input or Output FBUS rates. Alternatively, the queues can be organized as 2 bytes in width and accessed on alternate cycles.
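The clock rate needed by a single ported VIQ can be expressed as a one line calculation. The sketch below assumes, as the paragraph above states, that the memory must run at twice the faster of the fill and drain rates when it is one byte wide; treating a 2 byte wide queue as halving that rate is an interpretation of the alternate-cycle arrangement, and the function name is hypothetical.

```python
def required_viq_clock_mhz(input_mhz: float, output_mhz: float,
                           width_bytes: int = 1) -> float:
    """Clock rate for a single ported VIQ serving one read and one write per FBUS cycle."""
    if input_mhz <= 0 or output_mhz <= 0:
        raise ValueError("rates must be positive")
    if width_bytes < 1:
        raise ValueError("width must be at least one byte")
    # Two accesses (one read, one write) per cycle of the faster bus,
    # spread over the number of bytes moved per memory access.
    return 2 * max(input_mhz, output_mhz) / width_bytes
```

For instance, `required_viq_clock_mhz(66, 80)` evaluates twice the faster 80 MHz rate, and passing `width_bytes=2` models the 2 byte wide organization.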

[0033] In order to maximize the efficiency of the rotary switch 12, the switch fabric operates at twice the output FBUS drain rate. One way to accomplish this would be to cycle the VIQ's twice again as fast. Another way would be to have the VIQ's be twice as wide. Thus, if the VIQ's input segment is 8 bits wide, the VIQ's are buffered to form 16 bits of write data. Read operations will fetch 16 bits of read data, which will be supplied to the switch fabric in 8 bit chunks at 133-160 MHz.

[0034] The Output Segment Logic (OSL) 44 is a timeslot filler. The Output Segment Logic 44 uses Output Segment Arbitration 50 results to select which new Virtual Queue should be “connected” to an output port. The OSL examines transmit ready bits, which are collected by the Output Ready bit Logic (ORL) 48, to determine if the output port is ready to accept a new mpkt (64 bytes). The Output Segment Logic decouples the VIQ crossbar logic from the output drain rate by employing a 16 mpkt queue at each output segment (16*64B*8=8 KB). This decoupling allows the crossbar to operate at a higher frequency. The OSL 44 includes a 16 entry timeslot queue. Each VIQ to Output Port has an explicit timeslot entry. If a VIQ is not available, its slot may be compressed. Up to n slots may be compressed (most likely n=2) before filling is “wait stated” until skipped VIQ's are available.
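One possible reading of the timeslot filling behavior is sketched below, together with the output queue sizing arithmetic. The description does not define slot compression precisely; the sketch assumes that "compressing" a slot means skipping an unavailable VIQ, and that filling is wait stated once more than n consecutive slots would have to be skipped. The function and constant names are hypothetical.

```python
MPKT_BYTES = 64
OUTPUT_QUEUE_MPKTS = 16
OUTPUT_SEGMENTS = 8

# 16 mpkts * 64 bytes * 8 output segments = 8192 bytes (8 KB) in total.
TOTAL_OUTPUT_QUEUE_BYTES = OUTPUT_QUEUE_MPKTS * MPKT_BYTES * OUTPUT_SEGMENTS


def fill_timeslot(slots, position, available, max_compressed=2):
    """Pick the next VIQ to service, skipping at most max_compressed slots.

    slots          -- VIQ identifiers in timeslot order (16 entries in the text)
    position       -- index of the current timeslot
    available      -- set of VIQ identifiers that currently have data
    Returns (viq, next_position), or None when filling is wait stated.
    """
    for skipped in range(max_compressed + 1):
        index = (position + skipped) % len(slots)
        viq = slots[index]
        if viq in available:
            return viq, (index + 1) % len(slots)
    return None  # too many consecutive unavailable VIQs: wait state
```

With `max_compressed=2`, a VIQ two slots ahead of the current position can still be serviced immediately, while one three slots ahead causes a wait state under this interpretation.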

[0035] The Output Ready Logic (ORL) 48 interrogates the destination network devices 16 (FIG. 1) for transmit ready bits. The transmit ready bits are used by the RSC 12 in the Output Segment Logic 44 (OSL) to promote data from the RSC 12 to the appropriate output segment FBUS 31. The ORL 48 is a ready bus master. It cycles through all attached MAC's, fetching the transmit ready bits. The ORL 48 assembles all the transmit ready bits and provides them to their respective output segment. The OSL 44 uses these bits to determine if the tail of the queue should be filled with that output port's mpkt. This is done to avoid head of queue blocking.

[0036] The Output Segment Arbitration (OSA) 50 is used to link a virtual queue 42 (VIQ) to an output port. The RSC 12 employs a distributed crossbar selection scheme. The network processor 14 performs weighted fair queuing and provides the top elements for transmission to the RSC 12. The RSC 12 in turn uses a fair service algorithm and a non-blocking scheme so that efficiency is maintained.

[0037] Switch Arrangements

[0038] Referring to FIG. 4A, a rotary switch 14a is coupled to provide a 2×2 FIFO BUS switching fabric. The rotary switch 14a is fed by buses B0-B1, which are each 32 bit byte sliced busses. The output of the rotary switch 14a is coupled to output busses, e.g., FBUS_0, FBUS_1. On the input side, the four bytes of each bus B0-B1 are coupled in sequence to the rotary switch 14a, and on the output side the first four output segments of the rotary switch provide the bytes of FBUS_0 and the next four output segments provide the bytes of FBUS_1. The mapping for this arrangement is shown in FIG. 5A. In this manner a byte sliced architecture is provided. This byte sliced architecture is non-blocking. That is, any input port can be connected to any output port without blocking any other input port from connecting to any other output port. In any one cycle, all input ports can couple data to different ones of all of the output ports.

[0039] Referring to FIG. 4B, a pair of rotary switches 14a, 14b are coupled to provide a 4×4 FIFO BUS switching fabric. The rotary switches 14a, 14b have input segments coupled by buses B0-B3, which are each 32 bit byte sliced buses. The output segments of the rotary switches 14a, 14b are coupled to output busses, e.g., FBUS_0 to FBUS_3. On the input side, the first two bytes of each bus B0-B3 are coupled to the input segments of the first rotary switch 14a, whereas the last two bytes of each bus are coupled to the input segments of the second rotary switch 14b. On the output side, the first two output segments of each rotary switch provide the bytes of FBUS_0, the next two output segments the bytes of FBUS_1, and so forth. The mapping for this arrangement is shown in FIG. 5B. In this manner a byte sliced architecture is provided. This byte sliced architecture is a 4×4 architecture and is non-blocking. That is, any input port can be connected to any output port and not block any other input port from connecting to any other output port. In any one cycle, all input ports can couple data to different ones of all of the output ports.

[0040] Thus, rotary switches can be coupled to provide larger switching fabrics. Four switches (a mapping of which is set out below) can be coupled such that eight 4 byte buses are coupled to the four switches, with first bytes of each bus coupled to the first switch, second bytes of each bus coupled to the second switch, third bytes of each bus coupled to the third switch and fourth bytes of each bus coupled to the fourth switch. Moreover, with larger rotary switches, i.e., switches that can interface to larger buses, e.g., 8 byte buses, even larger configurations could be provided in a similar manner. On the output side a similar connection arrangement is provided.

[0041] This switching fabric is scalable, i.e., easily expanded from a 2×2 FIFO bus configuration (32 ports to 32 ports) up to an 8×8 FIFO bus configuration (128 ports to 128 ports) without adding additional hierarchical levels of switches. That is, expansion occurs on a single level of switches, which reduces latency and complexity.

[0042] Referring to FIG. 5A, the mapping as a 2×2 FIFO bus switching fabric requires one RSC 12, with the output bytes mapped as follows:

[0043] 1. output segment 0—byte 0 (bits(31:24)) output FIFO bus 0

[0044] 2. output segment 1—byte 1 (bits(23:16)) output FIFO bus 0

[0045] 3. output segment 2—byte 2 (bits(15:08)) output FIFO bus 0

[0046] 4. output segment 3—byte 3 (bits(07:00)) output FIFO bus 0

[0047] 5. output segment 4—byte 0 (bits(31:24)) output FIFO bus 1

[0048] 6. output segment 5—byte 1 (bits(23:16)) output FIFO bus 1

[0049] 7. output segment 6—byte 2 (bits(15:08)) output FIFO bus 1

[0050] 8. output segment 7—byte 3 (bits(07:00)) output FIFO bus 1

[0051] and the input segments mapped as:

[0052] 1. input segment 0—byte 0 (bits(31:24)) input FIFO bus 0

[0053] 2. input segment 1—byte 1 (bits(23:16)) input FIFO bus 0

[0054] 3. input segment 2—byte 2 (bits(15:08)) input FIFO bus 0

[0055] 4. input segment 3—byte 3 (bits(07:00)) input FIFO bus 0

[0056] 5. input segment 4—byte 0 (bits(31:24)) input FIFO bus 1

[0057] 6. input segment 5—byte 1 (bits(23:16)) input FIFO bus 1

[0058] 7. input segment 6—byte 2 (bits(15:08)) input FIFO bus 1

[0059] 8. input segment 7—byte 3 (bits(07:00)) input FIFO bus 1

[0060] where FBUS_(x)Y corresponds to byte “x” of FBUS “Y.” Thus, the output segments 0, 1, 2, 3 connect concurrently to corresponding Virtual Queues either in input segments 0, 1, 2, 3 or 4, 5, 6, 7, respectively.

[0061] Referring to FIG. 5B, to expand to a 4×4 FIFO bus switching fabric requires two RSC devices 14. FIG. 5B shows the output mapping, with the output mapped as:

[0062] 1. RSC_0 output segment 0—byte 0 (bits(31:24)) output FIFO bus 0

[0063] 2. RSC_0 output segment 1—byte 1 (bits(23:16)) output FIFO bus 0

[0064] 3. RSC_0 output segment 2—byte 0 (bits(31:24)) output FIFO bus 1

[0065] 4. RSC_0 output segment 3—byte 1 (bits(23:16)) output FIFO bus 1

[0066] 5. RSC_0 output segment 4—byte 0 (bits(31:24)) output FIFO bus 2

[0067] 6. RSC_0 output segment 5—byte 1 (bits(23:16)) output FIFO bus 2

[0068] 7. RSC_0 output segment 6—byte 0 (bits(31:24)) output FIFO bus 3

[0069] 8. RSC_0 output segment 7—byte 1 (bits(23:16)) output FIFO bus 3

[0070] 9. RSC_1 output segment 0—byte 2 (bits(15:08)) output FIFO bus 0

[0071] 10. RSC_1 output segment 1—byte 3 (bits(07:00)) output FIFO bus 0

[0072] 11. RSC_1 output segment 2—byte 2 (bits(15:08)) output FIFO bus 1

[0073] 12. RSC_1 output segment 3—byte 3 (bits(07:00)) output FIFO bus 1

[0074] 13. RSC_1 output segment 4—byte 2 (bits(15:08)) output FIFO bus 2

[0075] 14. RSC_1 output segment 5—byte 3 (bits(07:00)) output FIFO bus 2

[0076] 15. RSC_1 output segment 6—byte 2 (bits(15:08)) output FIFO bus 3

[0077] 16. RSC_1 output segment 7—byte 3 (bits(07:00)) output FIFO bus 3

[0078] The input would be mapped in a similar manner (not shown in FIG. 5B), with the input segments mapped as:

[0079] 1. RSC_0 input segment 0—byte 0 (bits(31:24)) input FIFO bus 0

[0080] 2. RSC_0 input segment 1—byte 1 (bits(23:16)) input FIFO bus 0

[0081] 3. RSC_0 input segment 2—byte 0 (bits(31:24)) input FIFO bus 1

[0082] 4. RSC_0 input segment 3—byte 1 (bits(23:16)) input FIFO bus 1

[0083] 5. RSC_0 input segment 4—byte 0 (bits(31:24)) input FIFO bus 2

[0084] 6. RSC_0 input segment 5—byte 1 (bits(23:16)) input FIFO bus 2

[0085] 7. RSC_0 input segment 6—byte 0 (bits(31:24)) input FIFO bus 3

[0086] 8. RSC_0 input segment 7—byte 1 (bits(23:16)) input FIFO bus 3

[0087] 9. RSC_1 input segment 0—byte 2 (bits(15:08)) input FIFO bus 0

[0088] 10. RSC_1 input segment 1—byte 3 (bits(07:00)) input FIFO bus 0

[0089] 11. RSC_1 input segment 2—byte 2 (bits(15:08)) input FIFO bus 1

[0090] 12. RSC_1 input segment 3—byte 3 (bits(07:00)) input FIFO bus 1

[0091] 13. RSC_1 input segment 4—byte 2 (bits(15:08)) input FIFO bus 2

[0092] 14. RSC_1 input segment 5—byte 3 (bits(07:00)) input FIFO bus 2

[0093] 15. RSC_1 input segment 6—byte 2 (bits(15:08)) input FIFO bus 3

[0094] 16. RSC_1 input segment 7—byte 3 (bits(07:00)) input FIFO bus 3

[0095] where in FIG. 5B, FBUS_(x)Y corresponds to byte “x” of FBUS “Y.” Thus output segments (RSC_0 0,1/RSC_1 0,1), (RSC_0 2,3/RSC_1 2,3), (RSC_0 4,5/RSC_1 4,5), (RSC_0 6,7/RSC_1 6,7), representing output FIFO busses 0, 1, 2, and 3 respectively, connect concurrently to corresponding Virtual Input Queues in VQL 42 for the input segments of input FIFO busses 0, 1, 2, and 3.

[0096] Thus, an 8×8 FIFO bus crossbar requires 4 RSC chips, with the output mapped as:

[0097] 1. RSC_0 output segment 0—byte 0 (bits(31:24)) output FIFO bus 0

[0098] 2. RSC_0 output segment 1—byte 0 (bits(31:24)) output FIFO bus 1

[0099] 3. RSC_0 output segment 2—byte 0 (bits(31:24)) output FIFO bus 2

[0100] 4. RSC_0 output segment 3—byte 0 (bits(31:24)) output FIFO bus 3

[0101] 5. RSC_0 output segment 4—byte 0 (bits(31:24)) output FIFO bus 4

[0102] 6. RSC_0 output segment 5—byte 0 (bits(31:24)) output FIFO bus 5

[0103] 7. RSC_0 output segment 6—byte 0 (bits(31:24)) output FIFO bus 6

[0104] 8. RSC_0 output segment 7—byte 0 (bits(31:24)) output FIFO bus 7

[0105] 9. RSC_1 output segment 0—byte 1 (bits(23:16)) output FIFO bus 0

[0106] 10. RSC_1 output segment 1—byte 1 (bits(23:16)) output FIFO bus 1

[0107] 11. RSC_1 output segment 2—byte 1 (bits(23:16)) output FIFO bus 2

[0108] 12. RSC_1 output segment 3—byte 1 (bits(23:16)) output FIFO bus 3

[0109] 13. RSC_1 output segment 4—byte 1 (bits(23:16)) output FIFO bus 4

[0110] 14. RSC_1 output segment 5—byte 1 (bits(23:16)) output FIFO bus 5

[0111] 15. RSC_1 output segment 6—byte 1 (bits(23:16)) output FIFO bus 6

[0112] 16. RSC_1 output segment 7—byte 1 (bits(23:16)) output FIFO bus 7

[0113] 17. RSC_2 output segment 0—byte 2 (bits(15:08)) output FIFO bus 0

[0114] 18. RSC_2 output segment 1—byte 2 (bits(15:08)) output FIFO bus 1

[0115] 19. RSC_2 output segment 2—byte 2 (bits(15:08)) output FIFO bus 2

[0116] 20. RSC_2 output segment 3—byte 2 (bits(15:08)) output FIFO bus 3

[0117] 21. RSC_2 output segment 4—byte 2 (bits(15:08)) output FIFO bus 4

[0118] 22. RSC_2 output segment 5—byte 2 (bits(15:08)) output FIFO bus 5

[0119] 23. RSC_2 output segment 6—byte 2 (bits(15:08)) output FIFO bus 6

[0120] 24. RSC_2 output segment 7—byte 2 (bits(15:08)) output FIFO bus 7

[0121] 25. RSC_3 output segment 0—byte 3 (bits(07:00)) output FIFO bus 0

[0122] 26. RSC_3 output segment 1—byte 3 (bits(07:00)) output FIFO bus 1

[0123] 27. RSC_3 output segment 2—byte 3 (bits(07:00)) output FIFO bus 2

[0124] 28. RSC_3 output segment 3—byte 3 (bits(07:00)) output FIFO bus 3

[0125] 29. RSC_3 output segment 4—byte 3 (bits(07:00)) output FIFO bus 4

[0126] 30. RSC_3 output segment 5—byte 3 (bits(07:00)) output FIFO bus 5

[0127] 31. RSC_3 output segment 6—byte 3 (bits(07:00)) output FIFO bus 6

[0128] 32. RSC_3 output segment 7—byte 3 (bits(07:00)) output FIFO bus 7

[0129] The input segments would be mapped as follows:

[0130] 1. RSC_0 input segment 0—byte 0 (bits(31:24)) input FIFO bus 0

[0131] 2. RSC_0 input segment 1—byte 0 (bits(31:24)) input FIFO bus 1

[0132] 3. RSC_0 input segment 2—byte 0 (bits(31:24)) input FIFO bus 2

[0133] 4. RSC_0 input segment 3—byte 0 (bits(31:24)) input FIFO bus 3

[0134] 5. RSC_0 input segment 4—byte 0 (bits(31:24)) input FIFO bus 4

[0135] 6. RSC_0 input segment 5—byte 0 (bits(31:24)) input FIFO bus 5

[0136] 7. RSC_0 input segment 6—byte 0 (bits(31:24)) input FIFO bus 6

[0137] 8. RSC_0 input segment 7—byte 0 (bits(31:24)) input FIFO bus 7

[0138] 9. RSC_1 input segment 0—byte 1 (bits(23:16)) input FIFO bus 0

[0139] 10. RSC_1 input segment 1—byte 1 (bits(23:16)) input FIFO bus 1

[0140] 11. RSC_1 input segment 2—byte 1 (bits(23:16)) input FIFO bus 2

[0141] 12. RSC_1 input segment 3—byte 1 (bits(23:16)) input FIFO bus 3

[0142] 13. RSC_1 input segment 4—byte 1 (bits(23:16)) input FIFO bus 4

[0143] 14. RSC_1 input segment 5—byte 1 (bits(23:16)) input FIFO bus 5

[0144] 15. RSC_1 input segment 6—byte 1 (bits(23:16)) input FIFO bus 6

[0145] 16. RSC_1 input segment 7—byte 1 (bits(23:16)) input FIFO bus 7

[0146] 17. RSC_2 input segment 0—byte 2 (bits(15:08)) input FIFO bus 0

[0147] 18. RSC_2 input segment 1—byte 2 (bits(15:08)) input FIFO bus 1

[0148] 19. RSC_2 input segment 2—byte 2 (bits(15:08)) input FIFO bus 2

[0149] 20. RSC_2 input segment 3—byte 2 (bits(15:08)) input FIFO bus 3

[0150] 21. RSC_2 input segment 4—byte 2 (bits(15:08)) input FIFO bus 4

[0151] 22. RSC_2 input segment 5—byte 2 (bits(15:08)) input FIFO bus 5

[0152] 23. RSC_2 input segment 6—byte 2 (bits(15:08)) input FIFO bus 6

[0153] 24. RSC_2 input segment 7—byte 2 (bits(15:08)) input FIFO bus 7

[0154] 25. RSC_3 input segment 0—byte 3 (bits(07:00)) input FIFO bus 0

[0155] 26. RSC_3 input segment 1—byte 3 (bits(07:00)) input FIFO bus 1

[0156] 27. RSC_3 input segment 2—byte 3 (bits(07:00)) input FIFO bus 2

[0157] 28. RSC_3 input segment 3—byte 3 (bits(07:00)) input FIFO bus 3

[0158] 29. RSC_3 input segment 4—byte 3 (bits(07:00)) input FIFO bus 4

[0159] 30. RSC_3 input segment 5—byte 3 (bits(07:00)) input FIFO bus 5

[0160] 31. RSC_3 input segment 6—byte 3 (bits(07:00)) input FIFO bus 6

[0161] 32. RSC_3 input segment 7—byte 3 (bits(07:00)) input FIFO bus 7

[0162] For the 8×8 FIFO crossbar configuration, each input/output segment of the RSC 12 switches 1 byte of the 32 bit FIFO bus concurrent with the other RSC 12 slices.
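The three mappings set out above (one, two and four RSC devices) follow a single pattern, which can be summarized in a short sketch. The function name `segment_mapping` is hypothetical; the arithmetic is inferred from the listed mappings and is offered as an illustration rather than as the device's configuration logic.

```python
SEGMENTS_PER_RSC = 8
BYTES_PER_BUS = 4  # each FIFO bus is 32 bits wide


def segment_mapping(num_rscs: int, rsc: int, segment: int):
    """Return (bus, byte, (msb, lsb)) for one segment of one RSC.

    num_rscs -- 1 for a 2x2 fabric, 2 for 4x4, 4 for 8x8
    rsc      -- index of the RSC device (0 .. num_rscs - 1)
    segment  -- input or output segment within that RSC (0 .. 7)
    """
    if num_rscs not in (1, 2, 4):
        raise ValueError("the listed mappings cover 1, 2 or 4 RSC devices")
    if not 0 <= rsc < num_rscs:
        raise ValueError("rsc index out of range")
    if not 0 <= segment < SEGMENTS_PER_RSC:
        raise ValueError("segment index out of range")
    bytes_per_rsc = BYTES_PER_BUS // num_rscs   # bytes of each bus handled by one RSC
    bus = segment // bytes_per_rsc
    byte = rsc * bytes_per_rsc + segment % bytes_per_rsc
    msb = 31 - 8 * byte                          # byte 0 occupies bits 31:24
    return bus, byte, (msb, msb - 7)
```

For example, `segment_mapping(2, 1, 3)` corresponds to RSC_1 segment 3 in the 4×4 mapping, and `segment_mapping(4, 2, 5)` to RSC_2 segment 5 in the 8×8 mapping.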

[0163] Referring to FIG. 6, distribution of the Output Segment Arbitration Logic (OSA) 50a-50h is shown. Each of the OSA logic elements 50 determines which virtual queue 42 to link to which of the output segment logic 44a-44h. The input FBUS segments at the start of a new packet provide a start of packet “SOP” flag, the destination port of this new packet and a virtual queue number. At this starting point the destination port is known, so a physical mapping to the output segment logic is performed. This mapping is stored in an output port Map Queue 60. Each physical port has an output port Map Queue 60₀-60₁₂₇. These queues maintain pointers to the next virtual queue which has a packet for the port. Each Map Queue 60 maintains up to 8 entries (one for each input segment). The entry has the VIQ# of the next packet to be transmitted.

[0164] When the Output Segment Logic 44 completes the transmission of a packet to a particular port, the Output Segment Logic requests the Output Arbitration Logic 50 to supply the Output Segment Logic 44 with the NEXT_VQ number that is stored in that output port's map queue 60. This NEXT_VQ number is an address that the Output Segment Logic 44 uses to control 8:1 multiplexers 62a-62h that feed the Output Segment Logic's 44 16 mpkt entry output queues. The VIQ number is a 7 bit value. The three most significant bits indicate which Input Segment and the least significant 4 bits indicate which VIQ within that input segment. Using these 7 bits the OSL can completely specify the next packet to be transmitted.
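The 7 bit VIQ number can be decoded as shown in the following sketch, which assumes only what is stated above: the three most significant bits select the input segment and the four least significant bits select the VIQ within that segment. The function names are hypothetical.

```python
def decode_viq_number(viq_number: int):
    """Split a 7 bit VIQ number into (input_segment, viq_within_segment)."""
    if not 0 <= viq_number < 128:
        raise ValueError("VIQ number must fit in 7 bits")
    return viq_number >> 4, viq_number & 0xF


def encode_viq_number(input_segment: int, viq: int) -> int:
    """Build the 7 bit VIQ number from a segment (0-7) and a VIQ index (0-15)."""
    if not 0 <= input_segment < 8:
        raise ValueError("input segment must be 0-7")
    if not 0 <= viq < 16:
        raise ValueError("VIQ index must be 0-15")
    return (input_segment << 4) | viq
```

The upper three bits are what would drive an 8:1 multiplexer select, while the lower four bits address one of the sixteen queues in the chosen segment.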

[0165] Disposed between the Input Segment Logic (ISL) 40 and the output port Map Queue 60 is a time division multiplex bus 70 used by the eight input segments to transmit destination information to the selected output segments. This multiplexing is straightforward since there are at most 8 new packets to be sorted and 16 cycles in which to promote their VIQ# to their respective Map Queues. Therefore, by employing a very simple round robin technique, the input segment destination ports are sorted over the next 8 cycles. The Map Queues drain using a first in first out algorithm.
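
The round robin use of the time division multiplex bus can be sketched as follows. This is a simplified model under stated assumptions: each input segment is assumed to own one fixed time slot in segment order, the function name `tdm_promote` is hypothetical, and the Map Queues are modeled as a dictionary of FIFOs keyed by destination port.

```python
from collections import deque

NUM_SEGMENTS = 8  # eight input segments share the time division multiplex bus


def tdm_promote(pending, map_queues):
    """Promote pending VIQ numbers to their Map Queues in round robin order.

    pending[i] is a (dest_port, viq) tuple for input segment i, or None if
    that segment has no new packet. map_queues maps dest_port to a deque.
    Returns the number of bus cycles consumed.
    """
    cycles = 0
    for segment in range(NUM_SEGMENTS):  # segment i owns the bus in cycle i
        request = pending[segment]
        if request is not None:
            dest_port, viq = request
            map_queues.setdefault(dest_port, deque()).append(viq)
        cycles += 1
    return cycles
```

With one slot per segment, all new packets are sorted in 8 cycles, which fits within the 16 cycles available, and packets bound for the same port are queued in segment order.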

[0166] Other Embodiments

[0167] It is to be understood that while the invention has been described in conjunction with the detailed description thereof, the foregoing description is intended to illustrate and not limit the scope of the invention, which is defined by the scope of the appended claims. Other aspects, advantages, and modifications are within the scope of the following claims.

What is claimed is:
 1. A switch fabric comprises: a network switch having a plurality of inputs and outputs; and a distributed switching arrangement to provide a non-blocking switching fabric capability over a series of byte sliced buses.
 2. The switching fabric of claim 1 wherein the switch is a first switch and the switching fabric further comprises: a second network switch having a plurality of inputs and outputs.
 3. The switching fabric of claim 1 wherein the distributed switching arrangement has the inputs of the first and second data switches coupled to a plurality of input buses so that a first byte of a first one of the buses is coupled to the first switch, and a last byte of the first bus is coupled to the second switch.
 4. The switching fabric of claim 1 wherein the distributed switching arrangement has the outputs of the first and second data switches coupled to a plurality of output buses so that a first byte of a first one of the output buses is coupled to the first switch, and a last byte of the first output bus is coupled to the second switch.
 5. A switch for coupling network devices to a network processor, comprises: a plurality of virtual queues; input segment logic coupled to at least one bus, said input segment logic to determine to which virtual queue incoming data should be sent to; and output segment logic to select which new Virtual Queue should be connected to an output port.
 6. The switch of claim 5 further comprising input ready logic to determine whether the input queues can receive data.
 7. The switch of claim 5 wherein the input queues are coupled to the output queues by a non-blocking crossbar switching arrangement.
 8. The switch of claim 7 wherein the non-blocking crossbar switching arrangement comprises: a plurality of multiplexers coupled to outputs of the plurality of the input virtual queues to select ones of the virtual queues to feed to a second plurality of multiplexers that produce inputs to the output segment logic.
 9. The switch of claim 8 further comprising: arbitration logic to select which of the virtual queues fed to the second plurality of multiplexers to couple to the output segment logic.
 10. A switch fabric, comprises: a pair of data switches each having a plurality of input ports and a plurality of output ports the switches capable of switching any of its input ports to any of its output ports; said pair of data switches having inputs coupled to a plurality of input buses so that a first byte of a first one of the buses is coupled to the first switch, and a last byte of the first bus is coupled to the second switch.
 11. The switch fabric of claim 10 wherein said pair of data switches having outputs coupled to a plurality of output buses so that a first byte of a first one of the buses is coupled to the first switch, and a last byte of the first bus is coupled to the second switch.
 12. The switch fabric of claim 10 wherein the pair of data switches comprise: a plurality of virtual queues; input segment logic coupled to the plurality of buses, said input segment logic to determine which virtual queue incoming data should be sent to.
 13. The switch fabric of claim 10 wherein the pair of data switches comprise: output segment logic coupled to the plurality of output buses, to select which virtual queue should be connected to an output port.
 14. The switch fabric of claim 10 wherein the pair of data switches comprise: logic to control mapping of bytes of the input bus to the input segment logic and mapping of bytes of the output segment logic to the output bus.
 15. A switch fabric, comprises: a first plurality of data switches each having a plurality of input ports and a plurality of output ports the plurality of switches capable of switching any of its input ports to any of its output ports; said plurality of data switches having inputs coupled to a plurality of input buses so that a first byte of a first one of the input buses is coupled to a first one of the plurality of switches, and a succeeding byte of the first input bus is coupled to a succeeding one of the plurality of switches.
 16. The switch fabric of claim 15 wherein the plurality of switches are two and the succeeding byte is a third byte of a four byte bus.
 17. The switch fabric of claim 16 wherein the switches couple four byte sliced buses, and with the first byte and the second byte of each of the four buses being coupled to the first switch and the third byte and fourth byte of each of the four buses being coupled to the second switch.
 18. The switch fabric of claim 15 wherein the plurality of switches are four and the succeeding byte is a second byte of a four byte bus.
 19. The switch fabric of claim 16 wherein the switches couple to eight byte sliced buses, and with the first byte of each of the eight buses being coupled to a first one of the switches, a second byte of each of the eight buses being coupled to a second one of the switches, a third byte of each of the eight buses being coupled to a third one of the switches, and a fourth byte of each of the eight buses being coupled to a fourth one of the switches.
 20. The switch fabric of claim 10 wherein said pair of data switches have outputs coupled to a plurality of output buses so that a first byte of a first one of the buses is coupled to the first switch, and a last byte of the first bus is coupled to the second switch.